ABSTRACT 


This memo discusses the purpose and practice of a code review, and reviews the 
approaches that we have found to be most effective at Microsoft. 


Introduction 


The DOS4 project is the first OS effort to make extensive use of code reviews. We 
have found that good code reviews improve the code quality, not only after the review 
but before, as well. When we started formal code reviews we discovered that there 


put effort into the process. 


This memo discusses the purpose and practice of a code review, and reviews the 
approaches that we have found to be most effective. 


Code Reviews - What and Why 


A code review is the reading of one person’s code by another. This is not normally 
done as a face-to-face meeting. Instead, the author sends the code to the reviewer and 
the reviewer returns a list of comments. Frequently, the reviewer illustrates the 
comments by editing the code itself. In this case, the edited code is returned and the 
author ‘diffs’ it against the original to extract the changes. 


There are four key goals for a code review: 


1. Guarantee Readability, Understandability and Maintainability 


This is the primary goal of the code review and is simultaneously the hardest and 
easiest to achieve. Simply put, can the reviewer pick up the code and 
immediately start understanding it? Within the first minute or so the reader 
should understand the purpose of the code, its major inputs, outputs, data 
structures and algorithms. The reviewer should, almost immediately, gain a clear 
grasp of just what this package does, to whom, and how. 


At a lower level the code should, by means of comments and lexical layout, 
literally explain itself. The partitioning of the work into subroutines should be 
clear and unambiguous. At any given point in any subroutine the reader should 
be able to quickly understand what is happening, what the control flow is, what 
the register and local variable contents are, etc. 


While reading the code, the reviewer should play the role of a future maintainer 
or debugger. At each point in each program, can the hypothetical maintainer 
understand which values are valid, the meaning of those values, and which 
registers, variables and subroutines may be used? 


I say that this is the easiest goal of a review because the goal of programming is 
not to cause a CPU to produce a desired result. Instead, the goal of programming 
is to encode the functioning of a correct and well understood algorithm. This 
encoding must meet two requirements: it must be transformable, via compiler or 
assembler, into functioning machine instructions, and it must be clearly and 
readily understandable to the engineers who will be working with it, not only the 
original engineer, but the reviewer and maintenance/modification engineers, as 
well. 


The key factor to remember, both as programmer and reviewer, is that what we 
are doing is engineering, not writing puzzles. A reader is expected to ‘figure out’ 
a puzzle, but a knowledgeable reader should never have to ‘figure out’ good code, 
whether that ‘figuring out’ be ‘playing compiler/assembler’ because of poor 
lexical style (indentation, parentheses) or whether it be a full scale ‘safari into the 
unknown’. 


2. Check for adherence to Design and Coding Standards 


This is the second-most important goal of a code review. This step is relatively 
mechanical but nevertheless very important since the code review is the only 
mechanism in the design cycle which can locate problems here and get them 
corrected. 


3. Review the Design for Appropriateness 


The reviewer evaluates the algorithms to ensure that they accomplish the required 
task and will be able to meet performance criteria. For example, a bubble sort 
algorithm used to sort an assembler symbol table would not be appropriate. 


This is the least important of the code review goals because it should be merely a 
formality and a sanity-check; the author should have proven his approach 
appropriate long before a code review. 


4. Look for Bugs and ‘Clumsies’ 


Finding bugs in a program is generally considered the prime goal of a code review. 
It is not and in fact appears low on this priority list. Although not of prime 
importance, finding bugs is certainly a valuable fallout from the code review. The 
time saved by bug detection during the review generally ‘pays’ for the code 
review immediately. 


A ‘clumsy’ is a sequence of code which, although correct, may be rewritten to 
optimize either speed or size or, in some cases, both. The degree of zeal with 
which clumsies are pursued depends upon the particular section of code. If the 
code is infrequently used and is not size critical then readability and 
understandability are the key concerns, not bytes and microseconds. Space and 
time-critical code, especially inner loops, should be scanned carefully for clumsies. 
Of course, the degree of ‘trickiness’ of the proposed replacement sequence must 
correlate to the space/time payback. Naturally, any truly tricky sequences must 
be very clearly documented. 
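

As an illustration of what a clumsy can look like, here is a small sketch. It is written in C rather than 8086 assembly so that it can be compiled anywhere, and both functions are hypothetical; neither comes from the code under review. The first version is correct but, as written, asks for the length of the string again on every pass through the loop. The second gives the same answer with a single scan.

```c
#include <stddef.h>
#include <string.h>

/* Clumsy: correct, but as written strlen() is called on every iteration,
   so an n-character string may cost on the order of n*n character reads
   (an optimizer may or may not hoist the call). */
size_t count_blanks_clumsy(const char *s)
{
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++) {
        if (s[i] == ' ')
            count++;
    }
    return count;
}

/* Rewritten: one pass, stopping at the terminating NUL. */
size_t count_blanks(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (*s == ' ')
            count++;
    }
    return count;
}
```

In a routine that runs once at startup the first form would hardly deserve a comment; in an inner loop it would.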


Most bugs are discovered while pursuing goal #1 above. In addition, a good 
reviewer knows that most, if not all, programs contain bugs. Coupling that 
knowledge with an understanding of the prime bug hiding spots in typical code, 
the reviewer should be able to flush out a few of them. 


Key Stumbling Blocks In Reviews 


The #1 stumbling block to the effectiveness of code reviews is the programmer's ego 
involvement in the code. It's natural for a programmer to see the code as ‘his baby’ 
and to take any alterations to it as criticisms. These perceived ‘criticisms’ are often 
taken as attacks on the programmer rather than as reasonable suggestions for 
improving the program. It is natural... but it's wrong. No programmer is born with 
the skills to write code, good or otherwise. We have all learned what we know, 
somewhere, somehow and none of us has learned all there is to know. The key to 
getting good results from a code review is to see it not as personal criticism but as a 
learning situation. 


It is easier to accept ‘learning’ in school because of the formal relationship between the 
instructor, who knows everything, and the student, who knows less. 


projects to those who are best qualified for them, it is likely that the reviewer knows 
less about this particular speciality than you do. Regardless of the specialized 
knowledge or lack of it, the reviewer can still point out flaws in technique and mistakes in 
work. It’s a safe bet that football coaches aren't as good at quarterbacking as the 
quarterbacks, but the combination of their experience and outside viewpoint makes 
their advice very valuable indeed. 


It is important to view software engineering, at least the coding aspects, as a science 
rather than a religion. In religion, you're convinced you're right and the goal is to 
convince your opponent that he is wrong. In science, you want to be right and you 
eagerly examine your colleague's arguments in the expectation that you will learn 
something new. In religion you know that you're right; in science you want to find out 
what is right. Top-level architecture issues and lowest-level lexical style issues can be 
religious. The top-level architecture is not at issue during a code review and the 
low-level lexical style is arbitrarily declared in the Microsoft Standard. Otherwise, we're 
all working towards the same goals of correctness, structure, size and speed. By freely 
discussing the pros and cons of the alternatives, we should nearly always reach a 
consensus on the correct decision. 


As mentioned, it's easy to take a change to your program as criticism of yourself as a 
programmer... especially when you are clearly wrong. Although it is difficult to argue 
because you are clearly wrong, such a viewpoint makes one defensive and less receptive 
to the remaining issues which may not be so black and white. It's important to 
remember that humans are biological creatures and have a certain innate error rate. 


It helps to consider yourself a "coding machine" that you have designed, built, and are 
continually upgrading. If you had such a machine, you would measure the quality of 
the items the machine is creating, both by direct measurement and by reading the 
letters received by the customer service department (the code review). When a 
defective unit is found, it is not taken as a personal criticism. Instead, the defect is 
dispassionately analyzed: 


Was there bad raw material? (Poor specification) 
Was the machine built incorrectly? (Bad design) 
Was the process defective? (Incorrect or inadequate tools) 
Was there an error in quality control? (Poor testing) 


After this analysis is complete, two steps would be taken: 


The source of the error would be adjusted to reduce further errors 
Tests would be run to make sure the problem was corrected 


You wouldn't take returned or defective units as personal criticism; you'd take pride 
in paying careful attention to such units so that you can reduce their number to the 
absolute minimum. This is exactly how you should treat yourself and your own error 
rate. No human will ever have a zero error rate, but you should treat each error as an 
opportunity to trim your rate even lower. It's OK to make errors; it's not OK to be 
negligent or to make careless errors. If you do your work carefully, ‘criticism’ of your 
work is not criticism of yourself; it is feedback to help you become even better than 
you are. 


How to Maximize the Benefit of a Code Review 


One of the nicest things about a code review is that it can improve code quality before 
it takes place. Since you know what kind of code the reviewer is going to demand, 
you'll naturally write it that way in the first place. You'll discover that structure and 
documentation techniques designed to help others work with your code also help you 
to work with it. Many times, when working on my code, I'm in the process of 
structuring and commenting it to demonstrate to a future reviewer that it covers all 
the bases when I discover that it doesn't cover them after all! Likewise, the strong 
documentation and design standards are a great help in desk checking, debugging, and 
later using the code for further work. 


My favorite technique is to envision a hypothetical reviewer who is somewhat 
obsessive. I challenge myself to write code that such a reviewer can find nothing to 
complain about. In areas where I think I've done a just-good-enough job, I go back 
and touch it up so there will be absolutely no grounds for any adjustment. At first 
this process takes conscious effort and extra time. However, it soon becomes habitual. 
The large benefits which result far outweigh the rather small initial overhead of 
producing clean code. 


As I emphasized above, it is important that you take the review comments as 
feedback, not as criticism. The reviewer's job is semi-mechanical: the code is 
double-checked against the ‘design rules’ (coding standards) and the understandability is 
evaluated by the acid test of trying to understand what the code is doing. The report 
to you will be a simple, non-critical measurement of how well your program does or 
doesn't meet these goals. 


Sometimes a reviewer's comments are inappropriate because of failure to understand 
some aspect of your design and therefore suggested changes are inappropriate. Such a 
case is still valuable feedback. If the reviewer misunderstood your code then it was not 
structured and/or documented clearly enough. It's like a beta-tester for your new 
"foobar Model X" circuit board plugging it in backwards and blowing it up. You don't 
criticize the tester; you say "We are really lucky that problem turned up in beta-test 
and not after production!" 


Hints for the Reviewer 


Be Objective 


Read the above discussions for the author, and invert. Remember that until the 
author becomes accustomed to the review process, there will be a tendency to take 
comments personally. Consider the author's state of mind as the review is read and 
word the review to allow the author to accept your suggestions with a minimum of 
bruising. He'll be more receptive and consequently more productive. 


When Uncertain, Ask 


A common reason for review comments is that you don’t understand something, or a 
code segment appears superfluous but you're not sure. In such cases don’t say "this is 
wrong, it should be like this..." but ask the author what the situation is and perhaps 
discuss a couple of alternatives. If one of your possible interpretations is the right one, 
you'll save a ‘turn around’ on the review. For example (discussing a macro paddr): 


What is the purpose of ‘paddr’? Can't you just have 
the assembler/linker give you the ‘segment #’ of the 
headers and avoid the arithmetic? If not that, how 
about just computing the values for the free and busy 
chain and storing those somewhere? 


As I emphasized above, the review process should be seen as a technical discourse 
about a scientific subject. When you want to make a point but you're not 100% sure 
that the point is correct then use a phrase such as "I argue that..." or "I believe 
that...". These phrases are important because they communicate the distinction 
between your belief that something is so and your statement that something is so (and 
that you can presumably prove it). 


Suggestions vs Demands 


Pay attention to how you phrase your comments, be they ‘suggestions’ or ‘demands’. 
A suggestion is better received than a demand. When the degree of sin is small, a 
suggestion is definitely called for: 


Routine gethandle: isn't this a misnomer? Isn't 
the function of this routine to get an address? 
I'd call it CHA (Convert Handle to Address) 


In this case a new name was suggested because the key item was the fact that the 
current name was wrong. The exact form that the new name will have is less 
important. 


The reviewer must adjust to the author. You must be careful not to go overboard in 
making ‘general suggestions’ and risk having the author not appreciate or understand 
the changes that have to be made. When you're not sure that the author will get the 
hint, spell it out a bit more clearly. More on this later. 


When you make a correction, discuss why it is better. This is important because the 
author is supposed to be using the review feedback to improve skills. Since we're 
talking science rather than religion, you should back up your suggestion with the 
underlying justification so that, in case the reviewer is wrong, the author can counter- 
argue. 


When commenting on a religious issue, there may be no ‘scientific reason’. In that 
case, just "quote the little red book". Examples: 


Note warnings about not removing block from free 
list, re-entrancy, etc. Stuff like this is 
necessary because slipups in reentrancy will give 
us bugs that it will take years of work to find. 
That means it is worth the investment of ‘overkill’ 
with regards to documentation, warnings, etc. 


Comment sub-banners should have their ";" at the 
left and be surrounded by white space. Major comment 
sub-banners have a ";*" in col 1 and 2 blanks before 
them. So sayeth the book. 


I convoluted the loop at gallocl: since ‘taken’ 
jumps are very expensive this is actually faster 
in both the found and not found cases and is 
smaller as well. 


Note that the last example was addressed to a beginner at 8086 assembly language 
coding and therefore explained something that might not have been obvious: the 
timing difference between taken and not-taken jumps. If the author were known to be 
an experienced 8086 hacker such detail might be offensive. The comment would then 
have read: 


I convoluted the loop at gallocl: for speed. 


"Repeat Some" 


Don't repeat an identical comment over and over as it applies to different lines of code; 
it gives the impression of “counting coup”. On the other hand, don’t just say it once 
and assume that the author will think to make all the corrections. Repeat the 
comment a couple of times, perhaps, then make reference to the ones you're omitting: 


Again, signed compares should be unsigned compares. 
Check all jumps, I have likely missed some. 
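

Why the signed/unsigned distinction in that comment matters can be shown with a small sketch. It is written in C, with 16-bit integer types standing in for 8086 registers; the function names and sample values are assumptions made for illustration and do not come from the reviewed code. Sizes and addresses are unsigned quantities, so a comparison that treats them as signed (JL/JGE on the 8086, rather than JB/JAE) gives the wrong answer as soon as the high bit is set.

```c
#include <stdint.h>

/* Wrong for sizes: both 16-bit values are reinterpreted as signed, the way
   a signed conditional jump (JL/JGE) would see them.  Converting an
   out-of-range value to int16_t is implementation-defined in C; on the
   usual two's-complement compilers it wraps, which is the case shown. */
int fits_signed(uint16_t request, uint16_t block_size)
{
    return (int16_t)request <= (int16_t)block_size;
}

/* Right for sizes: both values are compared as unsigned, as JB/JAE would. */
int fits_unsigned(uint16_t request, uint16_t block_size)
{
    return request <= block_size;
}
```

With a request of 1000h and a block of 9000h, the unsigned test says the request fits, while the signed test sees the block size as negative and rejects it. For small values the two agree, which is why this kind of bug survives light testing.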


Show Improved Examples 


It's often more constructive to show what you suggest, rather than just talking about 
it. For example: 


Note that instead of comments like: 

mov cx, es ; save the pointer 

You can say 

mov cx, es ; (cx) = segment of block 

which "refreshes" the reader's memory more. It's 
also easy to look back and see what the contents 
of (cx) are. 


Often it's convenient just to make the proposed edits to the source under review, then 
refer to what and why in the review memo: 


I convoluted the "space big enough" jumps to reduce 
the number of instructions in the loop cause it’s 
inner-most. Also, I'd try making the sizes of the 
special head and tail nodes FFFF so that I could 
shorten the loop still further and just say 
"is this big enough" and if the answer is yes, then 
test to see if we've actually run to the end. 


In the above example the reviewer ‘demo-ed’ one proposed change and just described 
another which involved more sweeping changes. 
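

The second, merely described, change in that comment is a sentinel search, and it can be sketched as follows. This is an illustration in C under stated assumptions, not the allocator being reviewed: the `node` layout and both function names are hypothetical, only the tail sentinel is modelled, and sizes are assumed to fit in 16 bits so that no request can exceed FFFF.

```c
#include <stddef.h>
#include <stdint.h>

struct node {
    uint16_t size;        /* size of this free block */
    struct node *next;    /* next block on the free chain */
};

/* Straightforward search: two tests per block (end of chain, then size). */
struct node *find_plain(struct node *first, uint16_t want)
{
    for (struct node *p = first; p != NULL; p = p->next) {
        if (p->size >= want)
            return p;
    }
    return NULL;
}

/* Sentinel search: the chain must end in `tail`, whose size is 0xFFFF.
   No request can exceed 0xFFFF, so the loop always stops, and the
   end-of-chain test runs once, after the loop, instead of once per block. */
struct node *find_sentinel(struct node *first, const struct node *tail,
                           uint16_t want)
{
    struct node *p = first;
    while (p->size < want)
        p = p->next;
    return (p == tail) ? NULL : p;
}
```

The inner loop of `find_sentinel` is a single compare and branch per block; the price is that every chain must be built with the sentinel in place, which is the "more sweeping" part of the change.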


Suggest, Rather than Command 


Phrasing your comments as suggestions helps defuse ego-involvement. In return, the 
author must not ignore such suggestions but give them full consideration. The author 
should either take the suggestion or counter-argue the point with the reviewer. Don't 
just let the suggestion ‘drop on the floor’; close the loop by informing the reviewer 
which suggestions you are passing up, and why. 


Suggestions that involve alternative ways of writing something are best presented as a 
synopsis of a general technique, followed by pointers to instances in the code where 
this technique might be applied. Since it’s been presented as a general technique, the 
author will be much more likely to use it again in the future. 


If your suggestion is more of a specific nature, try to word it as a question: "Have you 
considered foo?" Elaborate a little on the tradeoffs that you see. 


Don't Over-Specify 


When you point out a problem whose proper solution is clearly within the capabilities 
of the author, don’t spell it out in insulting detail. Do be sure to discuss it fully, but 
when you can, leave the details to the programmer. For example: 


If a guy keeps trying to shrink his segment by 
something less than minsize it will never get 
smaller. We need to keep both the TRUE size and 
the REQUESTED size of the block and reduce the 
REQUESTED size. When that becomes an allocation 
unit smaller, then we can calve off a free block. 
We also need the requested size for handling the 
‘sizeof’ call. 
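

One way the author might act on that comment is sketched below. It is a minimal illustration in C, not the reviewed routine: the `block` layout, the `shrink` function and the 16-byte value of `MINSIZE` are all assumptions. The point it shows is that the requested size is always recorded, so a series of small shrinks accumulates until a whole allocation unit can be split off.

```c
#include <stdint.h>

#define MINSIZE 16u   /* assumed allocation unit, in bytes */

struct block {
    uint16_t true_size;       /* bytes actually owned by the block */
    uint16_t requested_size;  /* bytes the caller last asked for */
};

/* Record a shrink request.  Returns the number of bytes calved off as a
   new free block (0 while the slack is still smaller than MINSIZE). */
uint16_t shrink(struct block *b, uint16_t new_size)
{
    if (new_size >= b->requested_size)
        return 0;                      /* not a shrink; nothing to do */

    b->requested_size = new_size;      /* always remember the request */

    uint16_t slack = (uint16_t)(b->true_size - b->requested_size);
    uint16_t freed = (uint16_t)(slack - slack % MINSIZE);  /* whole units only */
    b->true_size = (uint16_t)(b->true_size - freed);
    return freed;
}
```

Shrinking a 64-byte block to 56 bytes frees nothing but is remembered; shrinking it again to 48 frees one 16-byte unit. A ‘sizeof’ call would report `requested_size`, not `true_size`.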


Don’t Under-Specify 


Worse than over-specification is under-specification. You must make your comments 
sufficiently clear and precise so that the author can clearly understand you. Don’t let 
your suggestions flunk their own review due to vagueness or incompleteness. When 
you've just sketched out a solution, make sure you emphasize the inadequacy of the 
sketch. For example: 


As I understand it, we're looking to see if the 
block following a block is free. Why not just 
look? We know its header address. I changed 
the code in grealloc to do just this. One bug 
I’ve introduced, though, is that we'll have to 
‘terminate’ memory with a special block marked 
0 length and in use. Don’t need to put it on 
any chain, but it has to be there in case we try 
to extend a block which abuts high memory. 


Naturally, the level of detail necessary depends upon the author's skill and familiarity 
with the techniques you suggest. Better too much detail than too little; the author 
should understand that you're just making sure and not take offense. 
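

The terminator block mentioned in that review comment can be sketched as follows. This is an assumed model in C, not the grealloc code: memory is an array of header-sized slots, a block is one header followed by `size` data slots, and the names `hdr`, `next_block` and `extend` are hypothetical. Removing the absorbed block from the free chain is left out.

```c
#include <stddef.h>
#include <stdint.h>

struct hdr {
    uint16_t size;    /* data slots that follow this header */
    uint8_t  in_use;  /* 1 = allocated, 0 = free */
};

/* The header of the physically following block sits right after the data. */
static struct hdr *next_block(struct hdr *h)
{
    return h + 1 + h->size;
}

/* Try to grow block `h` by absorbing the block that follows it.
   Returns 1 on success, 0 if the neighbour is in use. */
int extend(struct hdr *h)
{
    struct hdr *n = next_block(h);
    if (n->in_use)
        return 0;                 /* also true for the terminator */
    h->size = (uint16_t)(h->size + 1 + n->size);  /* swallow header + data */
    return 1;
}
```

Because the terminator is marked in use and has zero length, `extend` on the last real block fails in the ordinary way. Without it, the same call would read a header beyond the end of memory.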


Give Positive Feedback 


Finally, after pages of suggestions and discussion of techniques to correct problems, 
don't forget to put in a little positive feedback. It's not the reviewer's job to embed 
comments in flattery - the purpose of the review is to point out areas that could be 
touched up. The author understands this and is not expecting a lot of ‘stroking’. 
However, when you see something has been done well, spend a few bytes of email and 
say so. 


As I alluded to above, the sincerest form of compliment from a reviewer is a short 
review memo. If you receive only a few comments and those comments are minor then 
the reviewer is telling you that he doesn't see how it could be done better, and that's a 
true compliment. 


Appendix 


This appendix consists of excerpts from a real-world program which adheres closely to 
these coding and design specs. Look this over and see how well you can understand 
what these routines do and how they do it. Imagine yourself being given the task to 
make some change or extension to this code: how quickly could you find the area(s) to 
be modified, how easily could you determine what the changes should be, and how 
likely is it that your changes might inadvertently break something else? 


Some specific notes: 


Page 1 & 2 


Unfortunately MASM requires a lot of obscure declarations at the beginning 
of the program. Note that effort has been made to split this stuff up into 
manageable hunks. Also note the initial banner which makes it clear, literally 
at first glance, what these routines do, and why. 

Page 3 


Note the use of “illustrations” to make a record's structure clear at a glance. 
Note that the programmer has gone to some effort to keep the 
communications density high while retaining clarity. 

Page 4 


The "NOTE:" comments point out special assumptions, requirements, or 
other items of importance that may not be directly obvious. 

Page 5 


The "NOTE:" comment here anticipates a potential problem for future 
maintainers and spells out the hidden peril. Again on page 7 a NOTE 
describes a situation which, although perfectly legitimate, might cause 
problems for future maintainers who might not pick up on the possibility of 
trailing garbage after the 00 byte. 

Page 9 


The author writes down his "proof" that this routine will not orphan a 
"name" record... to ensure that future changes here will not cause such 
orphans to become possible and to provide a signpost for reviewers looking 
for potential problems. 

Page 15 


Chk Block must be fast; the author explicitly notes this requirement. 

Page 20 


The share tables are very complex and prone to human error as well as 
future changes. The author has gone to a lot of work to try to make them 
understandable so that he and others can verify the correctness of the tables 
and have a ghost of a chance of correctly changing them in the future. In 
this case the density and complexity of the table called for exceptional 
documentation effort. 

Page 22 


The code at "cuc8" may jump up to cuc20. The code structure is still clean 
and understandable, so the author elected to note the “gotcha” rather than 
restructure the code to remove it. 

Page 27 


Note the WARNING:. Hidden among the algorithms is a meta-algorithm 
which terminates the “can’t find space; garbage collect; look again for space” 
loop. It's important that the author prove to himself and his reviewer that 
this loop completes; he also prevents a future maintainer from accidentally 
introducing such a bug (which would probably get through testing and out 
into the field). 


Nov 6 07:35 1984

        BREAK   <MFT and Lock Record Data Area>

;**     First MFT record
;
;       Note that the name field can have garbage after the trailing
;       00 byte.  This is because the field might be too long, but
;       not long enough (at least 16 extra bytes) to fragment.
;       In this case we copy the length of the string area, not
;       the length of the string and thus may copy trailing garbage.

lar:
        DB      0                       ; free
        DW      490                     ; 490 bytes long
        DB      487 DUP (0)             ; leave rest of record
MEND    DB      -1                      ; END record

lck1    DW      0                       ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck2    DW      lck1                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck3    DW      lck2                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck4    DW      lck3                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck5    DW      lck4                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck6    DW      lck5                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck7    DW      lck6                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck8    DW      lck7                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
Frelock DW      lck8                    ; Ptr to lock free list


        BREAK   <Sharer - MultiProcess File Sharer>

;       MSDOS MFT Functions
;
;       The Master File Table (MFT) associates the canonicalized pathnames,
;       lock records and SFTs for all files open on this machine.
;
;       These functions are supplied to maintain the MFT and extract
;       information from it.  All MFT access should be via these routines so
;       that the MFT structure can remain flexible.


        BREAK   <Mft_enter - Make an MFT entry and check access>

;**     mft_enter - make an entry in the MFT
;
;       mft_enter is called to make an entry in the MFT.
;
;       mft_enter checks for a file sharing conflict:
;               No conflict:
;                   A new MFT entry is created, or the existing one updated,
;                   as appropriate.
;               Conflicts:
;                   The existing MFT is left alone.  Note that if we had to
;                   create a new MFT there cannot be, by definition, sharing
;                   conflicts.
;
;       ENTRY   ThisSFT points to an SFT structure.  The sf_mode field
;                   contains the desired sharing mode.
;               Path_Name contains a long pointer to the full pathname for
;                   the file
;               Uid = 16-bit user id of issuer
;               Pid = 16-bit process id of issuer
;               (CS) = DOSGroup BUGBUG - installed
;               [OPEN_DEVID] = set appropriately
;               If FILE
;                   [CURBUF+2]:BX points to start of directory entry
;                   [THISDRV] set to drive number of file.
;               If device
;                   [DEVPT] = has DWORD pointer to device driver
;       EXIT    'C' clear if no error
;               'C' set if error
;                   (ax) = error code
;       USES    ALL but DS

        Procedure mft_enter,NEAR
        ASSUME  ES:NOTHING

        push    ds                      ; preserve (DS)

;       find or make a name record

        lds     si,Path_Name            ; (DS:SI) = FBA of file name
        DEBUG   1,6,<MFT_ENTER 1:6 ~ $x: $x Name=$s\n>,<DS,SI,DS,SI>
        mov     al,1                    ; allow creation of MFT entry
        push    es
        ASSUME  DS:NOTHING
        call    FNM                     ; find or create name in MFT
        pop     es
        mov     ax,error_not_enough_memory
        LJC     ent9                    ; not enough space
        DEBUG   1,6,<Survived FNM ($x)>,<bx>

;       add the SFT to the chain
;
;       (bx) = FBA name record

        lds     si,ThisSFT
        call    ASC                     ; try to add to chain

;       As noted above, we don't have to worry about an "empty" name record
;       being left if ASC refuses to add the SFT - ASC cannot refuse if we had
;       just created the MFT...
;
;       return.
;
;       'C' and (Ax) setup appropriately

ent9:   pushf
        xor     cx,cx
        popf
        jnc     ent10                   ; (CX) = 0 if OK
        not     cx                      ; (CX) = -1 if error
ent10:  DEBUG   1,8,<MFT_ENTRY: carry = $x ax = $x\n>,<cx,ax>
        pop     ds
        return

EndProc mft_enter


        BREAK   <MftClose - Close out an MFT for given SFT>

;**     MFTclose - Close an SFT/MFT Entry
;
;       MFTclose (SFT)
;
;       MFTclose removes the SFT entry from the MFT structure.  If this was
;       the last SFT for the particular file the file's entry is also removed
;       from the MFT structure.
;
;       Note that the SFT refcount has not yet been decremented by our caller.
;       A refcount of 1 means that the SFT is going idle
;
;       ENTRY   (ES:DI) points to an SFT structure
;               (DS) = (CS) = DOSGroup
;       EXIT    NONE
;       USES    ALL but DS, ES:DI

        Procedure MFTclose,NEAR
        ASSUME  ES:NOTHING
        mov     ax,es:[di].sf_MFT
        or      ax,ax
        jz      pcl10                   ; No entry for it, ignore ('C' clear)
        push    ds
        push    es                      ; preserve regs
        push    di
        call    COL                     ; clear SFT locks
        ASSUME  DS:NOTHING
oe re ee ee ee ee 


a eee —— 


x 

_ O1daga 
o+14de8 at1° [78] Yld quon‘xp 
a O1393a 
I3d@ 41° [ts] yg quOn’ tp 
TT02@ 
T8‘Ts 


810m ou 


I9quno> reato 
SS0IDpe poder y207 = (IS:Sq) 


xO'x) 
qadt{ 330 (78) ‘ts 
xe‘sp 

_ So’xe 

LAW 78° [1T8] ‘Ts 
T8‘Tp 

sp’xp 


eae 


S8aIpps Jus = (Iq:xp) : 


L4S JO sseippe - 
S82Ippe razjng = 


SJIOT 049 4uNnOD B407 


881T3 9lom ou s0r1za‘xe 


ut 
zuf 
dup 
zuf 
dus 

zf 


Io 


qns 
Ao 
Aow 
AOg 
AoW 
Aow 
Aou 


(IS:sa) 
(SOL: 89) 


“S3UGA BY Las ey 408 9A,9m 


qe2 
2938 
AO”U 


TOITO /A UINIeT —- ee, 004 sen X9puT 14S IO oTTZ sy] 


SSaIppe Iezznq = (TP: 89) 
10M OHOs UTBYD AOTIOZ 


TP 

91038 

T8‘xe 

= Sp’xe 
UTtys 48° [78] ‘Ts 
xo 

= 63938 
rads 43m" [Ts] ‘ts 
x2‘xq 

Ts 

91038 

Te’[e 


POJUBA BA Las 8Y4 payoeer GARY 
{UNO UTeYys Jus = (Z5) : 
SSO9IPpe prodsz eueu — (Is:sa) : 


T0330q 8,IeTTVD OguT owen {doo : 


emeU Ize'Ts 
3®8JJO sTqeq oves : ts 


tP 


19330q S,I9TT@D Jo Bseippe — 
£1400 Laid = 


dod 
zuf 
Io 
AOB 
SPT 
ep 
zxof 
BPT 
Syox 
dod 
zuf 
pug 
qs048 
QspoT 
ppe 
ysnd 
ysnd 


(Id: sq) 
(Is:sa) 


XO8DUT Lys = (xq) 


SI e3eg 


v86T 98:40 9 AoN 


:6993q 


© 
» 
© 
bo 
a 


* 
* 


799028 


[9903 


plage 


PIODaI omeu dtys : 
fddey st zettTe> : 
I9TI® BeTjeTI@WwW9 


aNa st : 


LAW LASd40 JO taz = (T8:sp) : 


3NO BT XOpuy Jays Jt ,seTTz er0N ou IO120,) 


SUT OTT 44.xX@ ITA Ut Pp 


8UQ pues suru 38U9 SuInger 47 


xB‘sp aoe 
S)'x2 ,o8 
X@PUT OTTZ = (xd) : xo‘xq Syort 
ONTHION:S3‘ONIHION:sq annss¥ ty 
UVEN'I097 gy 038P?° 
_(e3ue2 zo . 
®po> JoIz9 = (xe) : 
IOlI9 J} 498 ,5, , 
14S BTA BAD0T JO g = (xD) : 
14S JO pt Jesn = (Xa) ; 
®TITZ ST I9Zznq Iq:sq ® 
JOII@ OU JT IeOTD ,5, ihe Hl 
dnoznsoq = (ss) ; 
OBEU OTTZ IOZ 9zznq 04 guUTOoOd (1a: sa) 
XOpUT 14S pesuq-orez = (x5) 
XOPUT GIT3 peseq-ozez = (xg) xruiwe 
“LHS YBY4 Bla otTyz 9849 wo BYD0T jo 
iequnu ey. pue oTTz 38YQ UO Jus U9.(X9) cya zo qin 
*(pesomoid st Sutzepso0 = 
TeTNITIIed ou) geqT oy. ut eTT3 U3. (Xd) 949 seqeooT q8273 403°, a 
‘shetdstp snaeqe e2npolid 04 s4TTTqedeo sty. oe = 
“LAW 849 Bory uot QeMIOZUT UInIeZ 04 pasn BT 403 aH ge 


S2TITITIN wages 


14H 9Y9 Wor Lrque ue 409 - 192 ta 2 
<14W 94 worz Lx4u0 ue 903 - 903 aa 


“ONRU OTTZ O42 pegesoT 01,08 


13038 ions dE pg ot 
UST 338° [Ts] ‘Ts ppt 
x2 20P 
p1920 zat 
8tU3 J} 908 ‘omeu zoyqor"e oatt 
20030 sf 
$2928 zf 79030 


T-"3etz a3a° [ts] 4803 
OueU Qx0U TTQUN preAzoz uW?8 


Law: dnowsoq 1asaso'ts aoe 


9 AON 
Lt 83%g = BGT gen 


yy2q TItt Uttt Itto tt0t a: an 
y32q TTTT TTT OTtt ttOt uw ua | 
yy3p THTt TIT TITT tOTt auA a: 
yJap TrTT ttt TTOT TOTT AAG : 
pgp ott ttt0 ITTT TOTT wAad : 
J33E FETT TTT TIED ELE munua =: 
J333 TETT TTT TITT <ttt a mua: 
JE} TITT TIT TET. SLE unua =: 
gxat TET PENI TTIT TItt ma}: 
JIFE TTT TItt ITTT Bttt ey 5 * 
psp tott Ttt0 Ittt tott 2 
LETT TET FIT Tttt <x ! 
AX AMA OAM OA A AS y BUTINSsellV ¢ 
uuxuw w wa \ peaysso22V ¢ 
AMAA IAA >= 88929¥ 387 eatamfued =: 
wud peoyfueqd : 
x992 aaaad aacd qaaa / nedaop/kueq 
000 TtlZ sess StF ‘ 
eatin/peed SuON fued : ysFFTO Ad 
eatin ouon Aued : ep s-1010) ad 
pray ouon fued - yPL2TO nd 
e4tin/Peeu y fued : yz33490 raxet 
O4TIM y fused : ysFL90 Ad 
preu yw Aued : y332q00 aa 
estin/Ppeeu m Aued : uszFsPO na 
eatItn a fued : Us3aPO ad 
prow a Aued : UPLIPO nd 
eatin/peou =A/a fueg : Y3zFF0 AG 
eatin A/Y kueqg : UrJ3F0 Ad 
peow n/u fue : Usy3ss0 ad 
9aTIn/Peeu gzdgop : UzzFIIO ad 
PL aes, qazdmop : UssF30 Ad 4037 
prow qedmop : UPLFPO aq :Wond Law s0zgpuq 


gasaxt: TIRE <=.) “88925 AOTT® <= 202 
Qo ‘eanttes JO ggo2ons oyd Bo UBOTPUT yotatsod xeput PTO aya ut 37a SUL tp ou 
ain = (x9) : ain 38° (TP) "x4 tae 
7 qIyd 34 398 .9. xp'sp ™ 
‘y4xeput pto fq [xeput Mou]oTaes 359T FFTUS : 408 :tr909m 
nov+(EeHS) 40 XOPUT pandwoo ‘opos pro pus Aew 103 ‘ sseipp? 14S = (TP:xp) 
° ssoippe 2933nq = (SO . ‘ 
: guyTIo3Z1V ; sy20T FO QuUNOD ie ‘ 
etqea JTTZUOD ESN osuy oyy winger *6AIOT 3UTIUNCD ouoq | 


<sqotTzuoa eBesn FOU? — 9n>> wwaud e103 1YOHS ie 


qxou 111° [18] ‘t8 ry 


‘01903q) 


og o8ea yest 98°40 9 AON 61 838d 883 g¢:75 
“409 AON 




Nov 6 07:35 1984 Page 21

;       DR  RW  1011 1111 1111 1111     bfff
;       DN  R   0001 1100 0111 1101     1c7d
;       DN  W   0000 0011 1111 1111     03ff
;       DN  RW  0001 1111 1111 1111     1fff
;
;       In order to allow the greatest number of accesses, compatibility read mode
;       is treated as deny-write read. The other compatibility modes are treated
;       as deny-both.

;**     cuc - check usage conflicts
;
;       CUC is called to see if a would-be open would generate a share
;       conflict with an existing open.
;       See CUCA for the algorithm and table format.
;
;       ENTRY   (BX) = FBA MFT name record
;               (DS:SI) = SFT address
;       EXIT    'C' clear if OK
;               'C' set if conflict
;                 (ax) = error code
;       USES    ALL but arguments (BX, DS:SI)

cuc:    mov     ax,ds
        mov     es,ax
        mov     di,si                           ; (es:di) = FBA SFT record
        mov     al,BYTE PTR [si].sf_mode        ; (al) = mode byte
        mov     ch,al
        and     ch,sharing_mask                 ; (ch) = new guy share
        jz      cuc0                            ; new guy is compatibility mode
        mov     ch,sharing_mask
cuc0:   call    csi                             ; compute share index
        add     ax,ax                           ; *2 for word index
        xchg    ax,si                           ; (si) = share table index
        mov     dx,WORD PTR CUCA[si]            ; (dx) = share mask
        mov     ax,cs
        mov     ds,ax                           ; (ds:bx) = FBA MFT record
        lds     si,[bx].mft_sptr                ; (si) = first SFT guy

;       ready to do access compares.
;
;       (ds:si) = address of next SFT
;       (es:di) = address of new SFT
;       (dx) = share word from CUCA
;       (bx) = MFT offset
;       (ch) = 0 if new SFT is compatibility mode, else sharing mask

cuc1:   mov     ax,ds

Nov 6 07:35 1984 Page 22

        or      ax,si
        jz      cuc9                            ; at end of chain, no problems
        mov     al,BYTE PTR [si].sf_mode        ; (al) = mode byte
        DEBUG   1,8,< chk $b/$x mask $x -- >,<ax, cx, dx>
        call    csi                             ; compute the share index
        inc     ax
        inc     ax
        xchg    al,cl                           ; (cl) = shift count
        mov     ax,dx
        sar     ax,cl                           ; select the bit
        DEBUG   1,8,< $x°$x >,<ax,dx>
        jc      cuc8                            ; a conflict! (probably)

;       we may come back here from "cuc8" if the conflict is a false alarm...

cuc20:  lds     si,[si].sf_chain
        jmp     cuc1                            ; chain to next SFT and try again

;       Have a share conflict

cuc8:   cmp     open_devid,10000000b            ; If the file is a device that has been
        jz      cuccon                          ; opened in deny none mode and the
        mov     al,byte ptr [si].sf_mode        ; current open is in compatibility
        test    al,01000000b                    ; mode, allow the open.
        jz      cuccon                          ; Otherwise, report the conflict.
        mov     al,byte ptr es:[di].sf_mode
        test    al,11110000b
        jz      cuc20

cuccon: mov     ax,error_sharing_violation      ; assume share conflict
        DEBUG   1,8,<CUC - SHARE VIOLATION>,<>
        stc

;       done with compare. Restore regs and return
;
;       'C' set as appropriate
;       (es:di) = new SFT address
;       (ax) set as appropriate
;       (bx) = MFT offset

cuc9:   mov     cx,es
        mov     ds,cx
        mov     si,di
        ret

        BREAK <csi - compute share index>
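The bit test at the heart of cuc can be modelled in a few lines. What follows is only a sketch in Python, not the DOS code: it assumes that each CUCA row is a 16-bit word, that the share index is (share * 3) + access, and that the shift count of "index + 2" leaves bit (index + 1) of the row in the carry flag, with 1 meaning conflict. Only the four table rows legible in the listing are included; the constant and function names are invented for the illustration.

```python
# Sketch of the cuc bit test. Names and index encoding are assumptions.

COMPAT, DENY_RW, DENY_W, DENY_R, DENY_NONE = range(5)   # share values
READ, WRITE, READ_WRITE = range(3)                      # access values


def share_index(share, access):
    """Return the 0..14 table index for a (share, access) pair."""
    return share * 3 + access


# Only the last four CUCA rows are legible in the listing; the other
# eleven rows are deliberately left out rather than guessed.
CUCA = {
    share_index(DENY_R, READ_WRITE): 0xBFFF,
    share_index(DENY_NONE, READ): 0x1C7D,
    share_index(DENY_NONE, WRITE): 0x03FF,
    share_index(DENY_NONE, READ_WRITE): 0x1FFF,
}


def conflicts(new_index, old_index):
    """Mimic 'sar ax,cl' with cl = old_index + 2.

    The last bit shifted out, bit (old_index + 1) of the new open's
    mask, lands in the carry flag; a 1 there means a share conflict.
    """
    mask = CUCA[new_index]
    return (mask >> (old_index + 1)) & 1 == 1
```

Read this way, a deny-none reader is let in alongside an existing compatibility-mode reader, which matches the note that compatibility read is treated as deny-write read.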


Nov 6 07:35 1984 Page 23

;**     csi - compute share index
;
;       csi turns a mode byte into an index from 0 to 14:
;
;               (share index)*3 + (access index)

csi:    mov     ah,al
        and     ah,access_mask                  ; (ah) = access bits
        and     al,sharing_mask                 ; (al) = share bits
        ERRNZ   sharing_mask-0F0h
        shr     al,1
        shr     al,1
        shr     al,1
        mov     cl,al                           ; (cl) = SHVAL*2
        shr     al,1
        add     al,cl                           ; (al) = SHVAL*3
        add     al,ah                           ; (al) = SHVAL*3 + ACC
        sub     ah,ah
        ret

        BREAK <FNM - Find name in MFT>

Nov 6 07:35 1984 Page 24

;**     FNM - Find name in MFT
;
;       FNM searches the MFT for a name record.
;
;       ENTRY   (DS:SI) = pointer to name string (.asciz)
;               (BH) = 1 to create record if none exists
;                    = 0 otherwise
;       EXIT    'C' clear if found or created
;               'C' set if error
;                 If not to create, item not found
;                   (DS:SI) unchanged
;                 If to create, am out of space
;                   (ax) = error code
;       USES    ALL

FNM:    push    ds                              ; save string address
        push    si

;       (bh) = create flag
;       scan down through string, counting and summing

        sub     dx,dx                           ; (dx) = byte count
        sub     bl,bl                           ; (bl) = sum
fnm1:   lodsb                                   ; (al) = next char
        add     bl,al
        inc     dx
        and     al,al
        jnz     fnm1                            ; terminate after null char

;       info computed. Start searching name list
;
;       (bh) = create flag
;       (bl) = sum byte
;       (dx) = byte count
;       (TOS+2:TOS) = name string address

        mov     ax,cs
        mov     ds,ax
        mov     si,OFFSET MFT
fnm2:   test    [si].mft_flag,0FFh
        js      fnm10                           ; at end - name not found
        jz      fnm4                            ; is free, just skip it
        cmp     bl,[si].mft_sum                 ; do sums compare?
        jz      fnm6                            ; is a match - look further
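The index arithmetic of csi is easier to follow outside of register shuffling. The sketch below mirrors the shift-and-add sequence in Python. Both mask values are assumptions (sharing in the high nibble, access in the low nibble of the mode byte), and the function is an illustration, not a transcription of the routine.

```python
# Sketch of the csi index computation. Mask values are assumptions.

ACCESS_MASK = 0x0F    # assumed: access bits in the low nibble
SHARING_MASK = 0xF0   # assumed: sharing bits in the high nibble


def csi(mode):
    """Turn a mode byte into an index from 0 to 14: share*3 + access."""
    access = mode & ACCESS_MASK
    share_x2 = (mode & SHARING_MASK) >> 3   # three right shifts: share * 2
    share_x1 = share_x2 >> 1                # one more shift: share * 1
    return share_x2 + share_x1 + access     # share * 3 + access
```

Splitting the multiply by three into "times two plus times one" is what lets the assembly version get by with shifts and adds only.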


Nov 6 07:35 1984 Page 25

fnm4:   add     si,[si].mft_len                 ; not a match... skip it
        jmp     SHORT fnm2

;       name checksums match - compare the actual strings
;
;       (dx) = length
;       (DS:SI) = MFT address
;       (bh) = create flag
;       (bl) = sum byte
;       (dx) = byte count
;       (TOS+2:TOS) = name string address

fnm6:   mov     cx,dx                           ; (cx) = length to match
        pop     di
        pop     es                              ; (es:di) = fba given name
        push    es
        push    di
        push    si                              ; save MFT offset
        add     si,mft_name                     ; (ds:si) = fwa string in record
        repe    cmpsb
        pop     si                              ; (ds:si) = fwa name record
        jnz     fnm4                            ; not a match

;       Yes, we've found it. Return the info
;
;       (TOS+2:TOS) = name string address

        pop     ax                              ; discard unneeded stack stuff
        pop     ax
        mov     bx,si                           ; (ds:bx) = fwa name record
        clc
        ret
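The search in FNM is a two-stage match: a cheap one-byte checksum rules out most records, and only on a checksum hit are the strings compared. A minimal Python sketch of that idea follows. The record representation (a list of tuples) and all names are hypothetical; the real MFT is a packed byte buffer terminated by an END record.

```python
# Sketch of a checksum-filtered name search. The record model is hypothetical.

FREE, NAME = 0, 1   # stand-ins for MFLG_FRE and MFLG_NAM


def name_sum_and_count(name):
    """Sum and count the bytes of an ASCIZ name, terminator included."""
    data = name + b"\x00"
    return sum(data) & 0xFF, len(data)


def find_name(records, name):
    """Return the index of the record holding `name`, or None.

    `records` is a list of (flag, checksum, stored_name) tuples; reaching
    the end of the list plays the role of hitting the END record.
    """
    want_sum, want_len = name_sum_and_count(name)
    wanted = name + b"\x00"
    for i, (flag, rec_sum, rec_name) in enumerate(records):
        if flag == FREE:
            continue                    # free record, just skip it
        if rec_sum != want_sum:
            continue                    # checksums differ, cannot match
        if rec_name[:want_len] == wanted:
            return i                    # checksums and strings both match
    return None
```

Because the checksum is only a sum of bytes, names such as "AB" and "BA" collide, which is why the string compare cannot be skipped.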


;       Its not in the list - lets find a free spot and put it there
;
;       (bh) = create flag
;       (bl) = sum byte
;       (dx) = string length
;       (TOS+2:TOS) = ASCIZ string address
;       (ds) = SEG CODE

fnm10:  and     bh,bh
        jnz     fnm10$5                         ; yes, insert it
        pop     si
        pop     ds                              ; no insert, its a "not found"
        stc
        mov     ax,error_path_not_found
        ret

fnm10$5: add    dx,mft_name                     ; (dx) = minimum space needed
        mov     si,OFFSET MFT
fnm11:  test    [si].mft_flag,0FFh
        js      fnm20                           ; at END, am out of space
        jz      fnm12                           ; is a free record
        add     si,[si].mft_len                 ; skip name record
        jmp     SHORT fnm11

fnm12:  mov     ax,[si].mft_len                 ; Have free record, (ax) = total length
        cmp     ax,dx
        jnc     fnm15                           ; big enough
        add     si,ax
        jmp     SHORT fnm11                     ; not large enough - move on

;       OK, we have a record which is big enough. If its large enough
;       to hold another name record of 6 characters then we'll split
;       the block, else we'll just use the whole thing
;
;       (ax) = size of free record
;       (dx) = size needed
;       (ds:si) = address of free record
;       (bl) = sum byte
;       (TOS+2:TOS) = name string address

fnm15:  sub     ax,dx                           ; (ax) = total size of proposed fragment
        cmp     ax,mft_name+6
        jc      fnm16                           ; not big enough to split
        push    bx                              ; save sum byte
        mov     bx,dx                           ; (bx) = offset to start of new name record
        mov     [bx][si].mft_flag,MFLG_FRE
        mov     [bx][si].mft_len,ax             ; setup tail as free record
        sub     ax,ax                           ; don't extend this record
        pop     bx                              ; restore sum byte
fnm16:  add     dx,ax                           ; (dx) = total length of this record
        mov     [si].mft_len,dx
        mov     [si].mft_sum,bl
        mov     [si].mft_flag,MFLG_NAM
        mov     ax,ds
        mov     es,ax                           ; (es) = MFT segment for "stow"
        sub     ax,ax
        mov     di,si
        add     di,mft_sptr
        stosw
        stosw                                   ; zero SFT pointer
        ERRNZ   mft_lptr-mft_sptr-4
        stosw                                   ; zero LCK pointer
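The allocation step above is a first-fit search with a split-or-absorb decision. The Python sketch below shows that decision in isolation. It models the heap as a list of [flag, length] pairs rather than a byte buffer, and it assumes a 16-byte fixed header (the sum of the field widths before the name); both choices, and every name in the block, are illustrative assumptions.

```python
# Sketch of first-fit allocation with block splitting. The heap model is hypothetical.

FREE, NAME = 0, 1                 # stand-ins for MFLG_FRE and MFLG_NAM
HEADER_SIZE = 16                  # assumed offset of the name field in a record
MIN_FRAGMENT = HEADER_SIZE + 6    # smallest tail worth keeping as a free record


def allocate(heap, needed):
    """Claim the first free record of at least `needed` bytes.

    `heap` is a list of [flag, length] pairs in address order. Returns the
    index of the new name record, or None when no free record is big enough.
    """
    for i, (flag, length) in enumerate(heap):
        if flag != FREE or length < needed:
            continue
        leftover = length - needed
        if leftover >= MIN_FRAGMENT:
            heap[i] = [NAME, needed]              # take the front of the block
            heap.insert(i + 1, [FREE, leftover])  # tail stays on the free list
        else:
            heap[i] = [NAME, length]              # too small to split: absorb the slack
        return i
    return None
```

Absorbing a small tail is what leaves garbage bytes after the name's terminating zero, since the record is then longer than the string it holds.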


;       we're all setup except for the name.
;       Note that we'll block copy the whole name field, even though

        BREAK <MFT Definitions>

;**     DOS MFT definitions
;
;       The Master File Table (MFT) associates the canonicalized pathnames,
;       lock records and SFTs for all files open on this machine.
;
;       The MFT implementation employs a single memory buffer which is used
;       from both ends. This gives the effect (at least until they run into
;       each other) of two independent buffers.
;
;       MFT buffer
;
;       The MFT buffer contains MFT name records and free space. It uses a
;       classic heap architecture: freed name records are marked free and
;       conglomerated with any adjacent free space. When one is to create a
;       name entry the free list is searched first-fit. The list of name and
;       free records is always terminated by a single END record.
;
;       LOCK buffer
;
;       The lock buffer contains fixed format records containing record
;       locking information. Since they are fixed format the space is handled
;       as a series of chains: one for each MFT name record and one for the
;       free list.
;
;       Space allocation
;
;       The MFT is managed as a heap. Empty blocks are allocated on a
;       first-fit basis. If there is no single large enough empty block the
;       list is garbage collected.
;
;       MFT name records:
;
;                                                  Unique file id
;          8     16    16     32     16      16       16        8        n
;       | FLAG | LEN | SUM | SPTR | LPTR | DIRSEC | DIRPOS | DRVNUM | string |
;
;       FLAG   = record type flag
;       LEN    = total byte length of record.
;       SUM    = sum of identifier fields. Used to speed
;                searches
;       SPTR   = pointer to first SFT in SFT chain
;       LPTR   = pointer to first record in lock chain
;                segment is MFT segment
;       DIRSEC = The directory sector number of the file.
;       DIRPOS = The file's position in the directory sector.
;       DRVNUM = The file's drive number (A=0).
;       string = name string, zero-byte terminated. There
;                may be garbage bytes following the 00 byte;
;                these are counted in the LEN field.
;
;       NOTE 1 : The fields DIRSEC, DIRPOS, and DRVNUM make up a unique id
;                for the file.
;
;       NOTE 2 : The MFT records for devices set DIRPOS:DIRSEC to be a ptr
;                to their device driver and DRVNUM is set to FFh.
;
;       MFT free records
;
;       FLAG = record type flag
;       LEN  = total byte length of record.
;
;       MFT END records
;
;       | FLAG |
;
;       FLAG = record type flag
;
;       NOTE 1 : The last 3 fields of the record make up a unique identifier
;                for the file.
;
;       NOTE 2 : The MFT records for devices set DIRPOS:DIRSEC to be a ptr
;                to their device driver and DRVNUM is set to FFh.

;**     MFT definitions
;*
;*      NOTE: the flag and length fields are identical for all record types
;*            (except the END type has no length) This must remain so as
;*            some code depends upon it.
;*
;*      NOTE: Many routines check for "n-1" of the N flag values and if no
;*            match is found assume the flag value must be the remaining
;*            possibility. If you add or remove flag values you must check
;*            all references to mft_flag.
MFT_entry   STRUC
mft_flag    DB      ?               ; flag/len field
mft_len     DW      ?
mft_sum     DW      ?               ; string sum word
mft_sptr    DD      ?               ; SFT pointer
mft_lptr    DW      ?               ; LCK pointer
mft_dirsec  DW      ?               ; directory sector
mft_dirpos  DW      ?               ; position in directory sector
mft_drvnum  DB      ?               ; drive number
mft_name    DB      ?               ; offset to start of name
MFT_entry   ENDS

MFLG_NAM    EQU     1               ; min value for name record
MFLG_FRE    EQU     0               ; free record
MFLG_END    EQU     -1              ; end record

;**     Record Lock Record (RLR):
;
;          16      32      32      32
;       | NEXT |  FBA  |  LBA  | SPTR |
;       |      | lo hi | lo hi |      |
;
;       CHAIN  = pointer to next RLR. 0 if end
;       FBA    = offset of 1st byte of locked region
;       LBA    = offset of last byte of locked region
;       SFTPTR = pointer to SFT lock was issued on

RLR_entry   STRUC
rlr_next    DW      ?               ; chain to next RLR, 0 if end
rlr_fba     DW      ?               ; first byte addr (offset) of region
            DW      ?
rlr_lba     DW      ?               ; last byte addr of region
            DW      ?
rlr_sptr    DD      ?               ; SFT pointer
RLR_entry   ENDS
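A quick way to sanity-check the two layouts is to restate them with Python's struct module and let it compute sizes and offsets. The sketch assumes little-endian fields packed with no padding, in the order the STRUC definitions list them; the Python names are invented for the illustration.

```python
# Sketch: the MFT header and RLR layouts restated with the struct module.
import struct

# "<" selects little-endian with no alignment padding.
MFT_CODES = "BHHIHHHB"   # flag, len, sum, sptr, lptr, dirsec, dirpos, drvnum
MFT_FIELDS = ("flag", "len", "sum", "sptr", "lptr", "dirsec", "dirpos", "drvnum")
MFT_HEADER = struct.Struct("<" + MFT_CODES)
RLR_ENTRY = struct.Struct("<HIII")   # next, fba, lba, sptr


def field_offsets(names, codes):
    """Map each field name to its byte offset within the packed record."""
    offsets, pos = {}, 0
    for name, code in zip(names, codes):
        offsets[name] = pos
        pos += struct.calcsize("<" + code)
    return offsets


MFT_OFFSETS = field_offsets(MFT_FIELDS, MFT_CODES)
MFT_NAME_OFFSET = MFT_HEADER.size   # the name string follows the fixed header
```

Under these assumptions the LCK pointer sits exactly four bytes after the SFT pointer, which is the relationship the listing's ERRNZ check guards.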
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MEND 
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<MFT and Lock Record Data Area> 
;       First MFT record
;
;       Note that the name field can have garbage after the trailing
;       00 byte. This is because the field might be too long, but
;       not long enough (at least 16 extra bytes) to fragment.
;       In this case we copy the length of the string area, not
;       the length of the string and thus may copy trailing garbage.

MFT     DB      0                       ; free
        DW      490                     ; 490 bytes long
        DB      487 DUP (0)             ; leave rest of record
        DB      -1                      ; END record

lck1    DW      0                       ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck2    DW      lck1                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck3    DW      lck2                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck4    DW      lck3                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck5    DW      lck4                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck6    DW      lck5                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck7    DW      lck6                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
lck8    DW      lck7                    ; link
        DB      SIZE RLR_entry-2 DUP(0)
        DW      lck8                    ; Ptr to lock free list
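The initial state of the data area can be checked with a small model. The Python sketch below builds the starting MFT image (one free record followed by the END flag) and the lock free list (eight records, each linked to the one before it). It assumes the free record's length word is 490, that is 1 flag byte + 2 length bytes + 487 filler bytes, and it uses record numbers in place of offsets for the links; the function names are invented here.

```python
# Sketch of the initial MFT image and lock free list. Values noted below are assumptions.

def initial_mft_image():
    """One free record (flag 0, length 490) followed by the END flag byte."""
    free_record = bytes([0]) + (490).to_bytes(2, "little") + bytes(487)
    return free_record + bytes([0xFF])   # 0FFh is -1 as a byte: END


def build_lock_free_list(count=8):
    """Link record n to record n-1; record 1 links to 0, the end marker.

    Returns (head, links) where head is the last record built, matching
    the final pointer to the lock free list in the listing.
    """
    links, prev = {}, 0
    for n in range(1, count + 1):
        links[n] = prev
        prev = n
    return prev, links


def chain_length(head, links):
    """Count records reachable from `head` before the 0 end marker."""
    n = 0
    while head != 0:
        n += 1
        head = links[head]
    return n
```

Walking the chain from the head visits all eight lock records, so every one of them starts out available.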
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BREAK <Sharer - MultiProcess File Sharer> 


;**     MSDOS MFT Functions
;
;       The Master File Table (MFT) associates the canonicalized pathnames,
;       lock records and SFTs for all files open on this machine.
;
;       These functions are supplied to maintain the MFT and extract
;       information from it. All MFT access should be via these routines so
;       that the MFT structure can remain flexible.

        BREAK <Mft_enter - Make an MFT entry and check access>

;**     mft_enter - make an entry in the MFT
;
;       mft_enter is called to make an entry in the MFT.
;
;       mft_enter checks for a file sharing conflict:
;
;         No conflict:
;           A new MFT entry is created, or the existing one updated,
;           as appropriate.
;
;         Conflicts:
;           The existing MFT is left alone. Note that if we had to
;           create a new MFT there cannot be, by definition, sharing
;           conflicts.
;
;       ENTRY   ThisSFT points to an SFT structure. The sf_mode field
;                 contains the desired sharing mode.
;               Path name contains a long pointer to the full pathname for
;                 the file
;               Uid = 16-bit user id of issuer
;               Pid = 16-bit process id of issuer
;               (cs) = DOSGroup BUGBUG - installed
;               [OPEN_DEVID] = set appropriately
;               If FILE
;                 [CURBUF+2]:BX points to start of directory entry
;                 [THISDRV] set to drive number of file.
;               If device
;                 [DEVPT] = has DWORD pointer to device driver
;
;       EXIT    'C' clear if no error
;               'C' set if error
;                 (ax) = error code
;       USES    ALL but DS

        Procedure mft_enter,NEAR
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